Ranked Set Sampling


•  A  sampling  design  where  expert  judgment  (or 
simple  observation)  is  used  in  combination 
with  simple  random  sampling 

•  Simple  random  sampling  is  used  to  create  a 
large  number  of  potential  samples.  The  expert 
then  ranks  these  potential  samples  and  selects 
which  to  send  for  analysis 


Statistician  &  Expert 


•  Statistician: Selects "m" sets of random samples of
size "m" as potential samples (total "m x m")

•  Expert:  Within  each  set,  the  expert  ranks  (grades) 
the  potential  samples  from  highest  to  lowest  based 
on  the  expert's  opinion 

•  Together:  From  the  first  set,  the  largest  is  chosen; 
from  the  second  set,  the  second  largest  is  chosen; 
from  the  third  set,  the  third  largest  is  chosen,  etc. 

•  Result:  A  "super-sample"  of  size  “m” 


Expert  opinion  can  vary 


•  Qualitative:  Visual  inspection 

-  biomass  volume 

-  surface  soil  color 

-  seedling  counts 

-  heights  of  bushes 

•  Quantitative:  Auxiliary  data 

-  historical  data 

-  on-site  detectors 

-  pH  meter 

-  portable  equipment 


How  RSS  works 


•  A total of "m x m" potential samples will be
identified and divided into "m" sets, each
containing "m" potential samples

•  One  out  of  the  “m”  potential  samples  in  each  set 
will  be  sent  for  analysis,  the  rest  discarded 

•  For  example,  if  we  decided  to  send  20  samples  to 
the  laboratory  for  analysis  then  20  x  20  =  400 
potential  samples  would  have  to  be  identified 


The  3-Step  RSS  method 


•  Randomly identify "m²" sample units using
simple random sampling and allocate them into
"m" sets of size "m"

•  Rank  the  units  within  each  set  using  the  auxiliary 
variable  selected  by  the  expert 

•  Select  the  m  units  to  be  sent  for  analysis  using 
this  pattern: 

-  from  Set  1,  select  the  unit  with  rank  1 

-  from  Set  2,  select  the  unit  with  rank  2 
...and  so  on  until... 

-  from  Set  m,  select  the  unit  with  rank  m 
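
The three steps above can be sketched in code. This is a minimal illustration, not a prescribed implementation: the function name `ranked_set_sample`, the `rank_key` callback (standing in for the expert's ranking variable), and the choice of rank 1 = smallest are all assumptions made for the example.

```python
import random


def ranked_set_sample(population, m, rank_key, rng=None):
    """Return one ranked set sample of m units (hypothetical helper)."""
    rng = rng or random.Random()
    # Step 1: simple random sample of m*m units, split into m sets of size m.
    units = rng.sample(population, m * m)
    sets = [units[i * m:(i + 1) * m] for i in range(m)]
    selected = []
    for i, one_set in enumerate(sets):
        # Step 2: rank the units within the set on the auxiliary variable.
        ranked = sorted(one_set, key=rank_key)
        # Step 3: from set i+1 keep the unit with rank i+1 (assumed: rank 1 = smallest).
        selected.append(ranked[i])
    return selected


# Illustrative use: units are plain numbers ranked on their own value.
example = ranked_set_sample(list(range(1000)), m=4, rank_key=lambda u: u,
                            rng=random.Random(0))
print(example)
```

Only the m selected units go to the laboratory; the other m² - m units are ranked but never measured.
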
How good is this Ranked Set Sample?


[Figure: Relative Precision (vertical axis) plotted against Number in Set (m) (horizontal axis, ticks at 2, 4, 8, 10), with separate curves for the Uniform, Normal, and Exponential distributions]
RSS: The Big Advantage


Define "advantage" as relative precision:

Relative Precision = (Variance of the mean from a simple random sample) / (Variance of the mean from a ranked set sample)

The more this exceeds 1, the better RSS performs.

For example: using the preceding graph, if the data are
Normally distributed, an RSS of size 8 is as effective as
a simple random sample of size
8 x 4.1 = 32.8, i.e. a sample of 33

Downside  is  the  cost  of  sample  selection: 

-  Need  to  identify  8  x  8  =  64  samples 

-  Need  to  rank  the  8  sets  of  8  by  some  selected  variable 
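
Relative precision as defined on this slide can be estimated by simulation. The sketch below assumes perfect ranking (units are ranked on their true values) and standard Normal data; the function names and the number of replications are illustrative choices, and the estimate is subject to simulation noise.

```python
import random
import statistics


def rss_mean(m, draw, rng):
    """Mean of one ranked set sample of size m under perfect ranking."""
    total = 0.0
    for i in range(m):
        ranked = sorted(draw(rng) for _ in range(m))
        total += ranked[i]  # keep the unit of rank i+1 from set i+1
    return total / m


def srs_mean(m, draw, rng):
    """Mean of a simple random sample of the same size m."""
    return sum(draw(rng) for _ in range(m)) / m


def relative_precision(m, draw, reps=4000, seed=0):
    """Variance of the SRS mean divided by variance of the RSS mean."""
    rng = random.Random(seed)
    rss = [rss_mean(m, draw, rng) for _ in range(reps)]
    srs = [srs_mean(m, draw, rng) for _ in range(reps)]
    return statistics.variance(srs) / statistics.variance(rss)


def normal_draw(rng):
    return rng.gauss(0.0, 1.0)


m = 4
rp = relative_precision(m, normal_draw)
# Equivalent simple-random-sample size, as in the slide's 8 x 4.1 example.
print(f"m={m}: relative precision ~ {rp:.2f}, equivalent SRS size ~ {m * rp:.1f}")
```
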



Effect  of  Imperfect  Rankings 


Suppose  the  data  are  approximately  Normal: 

•  Small  errors  or  confusions:  Relative  Precision 
declines  up  to  approximately  25% 

•  Large  errors  or  confusions:  Relative  Precision 
declines  up  to  approximately  50% 

•  But good news! Even in the worst case, where the
rankings are no better than random, the Ranked Set
Sample performs the same as a Simple Random Sample
of the same size, so there is no loss in statistical
power
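
The effect of ranking errors can be explored with a small simulation. The noise model here is an assumption made for illustration only: the expert is taken to rank on a "judged" value equal to the true value plus independent Normal noise of standard deviation `noise_sd`, so `noise_sd = 0` is perfect ranking and a very large `noise_sd` is ranking at random.

```python
import random
import statistics


def rss_mean_noisy(m, noise_sd, rng):
    """Mean of one RSS of size m when ranking uses a noisy judged value."""
    total = 0.0
    for i in range(m):
        units = []
        for _ in range(m):
            true_value = rng.gauss(0.0, 1.0)
            judged = true_value + rng.gauss(0.0, noise_sd)  # assumed error model
            units.append((judged, true_value))
        units.sort()            # rank on the judged value
        total += units[i][1]    # but measure the true value
    return total / m


def relative_precision_noisy(m, noise_sd, reps=4000, seed=0):
    """Variance of the SRS mean divided by variance of the noisy-RSS mean."""
    rng = random.Random(seed)
    rss = [rss_mean_noisy(m, noise_sd, rng) for _ in range(reps)]
    srs = [sum(rng.gauss(0.0, 1.0) for _ in range(m)) / m for _ in range(reps)]
    return statistics.variance(srs) / statistics.variance(rss)


for sd in (0.0, 0.5, 1.0, 1000.0):
    print(sd, round(relative_precision_noisy(4, sd), 2))
```

With random ranking the estimate should sit near 1, which is the "no worse than a simple random sample" case described above.
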
Can't Identify m² samples


•  Decide  how  many  samples  it  is  feasible  to  send  to 
the  laboratory  (n) 

•  Decide  how  many  RSS  samples  can  be  properly 
identified  at  a  time  (m) 

•  Use  ordinary  RSS  with  the  “m”  identified  samples 
and  repeat  enough  times  to  reach  “n” 

•  For example, if we can afford to send 20 samples for
analysis and can rank up to 4 units at a time, take
5 cycles of RSS with m = 4
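
The cycle arithmetic can be sketched as follows. The helper name `rss_in_cycles` and its parameters are hypothetical; the sketch assumes n is a whole multiple of m and draws all potential units without replacement.

```python
import random


def rss_in_cycles(population, n, m, rank_key, rng=None):
    """Collect n units by repeating ordinary RSS with set size m for n // m cycles."""
    if n % m != 0:
        raise ValueError("n must be a multiple of m in this sketch")
    rng = rng or random.Random()
    cycles = n // m
    # Each cycle needs m*m potential units, so cycles*m*m in total.
    units = rng.sample(population, cycles * m * m)
    selected = []
    for c in range(cycles):
        block = units[c * m * m:(c + 1) * m * m]
        for i in range(m):
            ranked = sorted(block[i * m:(i + 1) * m], key=rank_key)
            selected.append(ranked[i])  # rank i+1 from set i+1 of this cycle
    return selected


sample = rss_in_cycles(list(range(1000)), n=20, m=4, rank_key=lambda u: u,
                       rng=random.Random(1))
print(len(sample), "units selected from", (20 // 4) * 4 * 4, "potential units")
```

For n = 20 and m = 4 this identifies 5 x 16 = 80 potential units, compared with the 20 x 20 = 400 needed for a single RSS of size 20.
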
Lognormal distribution of data


•  Use  unequal  cycle  allocation  depending 
on  the  shape  of  the  distribution 


•  More  samples  are  taken  from  the 
higher  ranked  values  than  the  lower 
ranked  values. 




Unequal  Cycle  Allocation:  Example 


For equal allocation, 12 samples are needed, i.e. 3 cycles of
4 sets. This distribution is skewed (lognormal) with a long tail
of high values, so unequal allocation is needed
Notice our sample of 12 consists of 3 small, 4 medium, and 5 large,
reflecting the known skewness of the population


Effect  of  unequal  allocation 


•  Good  news:  Even  more  effective  than  equal 
allocation  (which  was  already  much  better  than 
random  sampling) 

•  But  there's  a  cost: 

-  Extra  effort  to  create  the  "unequalness"  in  a 
meaningful  way  (how  many  extra  high  ones, 
how  many  low  ones?) 

-  Must  adjust  the  way  to  calculate  mean  and 
variance  (need  to  call  in  a  statistician) 
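
One way the mean calculation changes under unequal allocation is sketched below. It is an illustration under stated assumptions, not the method used for these slides: three ranks (small, medium, large) with 3, 4, and 5 units kept respectively, perfect ranking, and the population mean estimated as the average of the per-rank means so that the over-sampled high ranks are not over-weighted. All names are hypothetical.

```python
import random


def unbalanced_rss(population, m, allocation, rank_key, rng=None):
    """allocation[r] = number of sets from which the unit of rank r+1 is kept."""
    if len(allocation) != m or min(allocation) < 1:
        raise ValueError("need one positive count per rank")
    rng = rng or random.Random()
    units = rng.sample(population, sum(allocation) * m)
    by_rank = [[] for _ in range(m)]
    start = 0
    for r, n_r in enumerate(allocation):
        for _ in range(n_r):
            ranked = sorted(units[start:start + m], key=rank_key)
            by_rank[r].append(ranked[r])
            start += m
    return by_rank


def unbalanced_rss_mean(by_rank):
    """Average the per-rank means, giving every rank equal weight."""
    rank_means = [sum(values) / len(values) for values in by_rank]
    return sum(rank_means) / len(rank_means)


rng = random.Random(2)
population = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]  # skewed data
by_rank = unbalanced_rss(population, 3, [3, 4, 5], rank_key=lambda u: u, rng=rng)
print([len(v) for v in by_rank], unbalanced_rss_mean(by_rank))
```

Taking a plain average of all 12 values instead would tilt the estimate toward the high ranks, which is why the usual formula has to be adjusted.
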
How does it affect statistical tests?


•  If you analyze the RSS as though it were a simple
random sample (which it effectively is when the ranking
is ineffective), then the result is:

-  For  decision-making 

~  Smaller  false  acceptance/rejection  rates 
~  Smaller  "grey  region"  of  uncertainty 

-  For  estimation 

~  More  accurate  answers 
~  Less  chance  of  error 
Ranked Set Sampling: Conclusions


•  Pros 

-  Better  representativeness  through  using  experts 

-  Better  precision  than  Random  Sampling 

-  Same  simple  formulae  to  use 

•  Cons 

-  Increased  cost  of  the  expert  ranking  samples 

-  Difficulty  quantifying  exact  improvement 

-  Need  to  find  best  variable  to  do  the  ranking  on 

...but  the  Pros  definitely  outweigh  the  Cons! 
Further advice on Ranked Set Sampling


•  Guidance  on  Choosing  a  Sampling  Design 
for  Environmental  Data  Collection  QA/G-5S 
(www.epa.gov/quality) 